Prominence based scoring of speech segments for automatic speech-to-speech summarization

Authors

  • Sree Harsha Yella
  • Vasudeva Varma
  • Kishore Prahallad
Abstract

To perform speech summarization, it is necessary to identify the important segments in a speech signal. The importance of a speech segment can be effectively determined using information from lexical and prosodic features. Standard speech summarization systems depend on ASR transcripts or gold standard human reference summaries to train a supervised system that combines lexical and prosodic features to select segments for the summary. We propose a method that uses the prominence values of the syllables in a speech segment to rank the segment for summarization. The proposed method does not depend on ASR transcripts or gold standard human summaries. Evaluation results showed that summaries generated by the proposed method are as good as summaries generated using tf*idf scores and summaries generated by a supervised system trained on gold standard summaries.
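The ranking idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how syllable prominence values are computed or combined, so everything here is an assumption made for illustration: the `Segment` type, the use of the mean prominence as the segment score, and the greedy selection under a duration budget are hypothetical and are not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    start: float  # segment start time in seconds
    end: float    # segment end time in seconds
    # Hypothetical input: one prominence value per syllable in the segment.
    syllable_prominence: List[float]

    @property
    def duration(self) -> float:
        return self.end - self.start


def segment_score(seg: Segment) -> float:
    """Score a segment by its mean syllable prominence.

    Assumption: the mean is used here only as one plausible way to
    aggregate syllable prominence; the source does not specify it.
    """
    if not seg.syllable_prominence:
        return 0.0
    return sum(seg.syllable_prominence) / len(seg.syllable_prominence)


def summarize(segments: List[Segment], max_duration: float) -> List[Segment]:
    """Greedily pick the highest-scoring segments that fit the budget."""
    ranked = sorted(segments, key=segment_score, reverse=True)
    chosen: List[Segment] = []
    total = 0.0
    for seg in ranked:
        if total + seg.duration <= max_duration:
            chosen.append(seg)
            total += seg.duration
    # Restore chronological order so the audio summary plays in sequence.
    return sorted(chosen, key=lambda s: s.start)


if __name__ == "__main__":
    segments = [
        Segment(0.0, 4.0, [0.2, 0.3]),
        Segment(4.0, 9.0, [0.9, 0.7, 0.8]),
        Segment(9.0, 12.0, [0.5, 0.6]),
    ]
    for seg in summarize(segments, max_duration=8.0):
        print(seg.start, seg.end, round(segment_score(seg), 2))
```

Because the score depends only on acoustic prominence values, a sketch like this needs neither ASR transcripts nor reference summaries, which is the property the abstract emphasizes.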

Similar resources

Comparison of different machine learning methods for Persian extractive speech-to-speech summarization without using transcripts

In this paper, extractive speech summarization using different machine learning algorithms was investigated. The task of speech summarization deals with extracting important and salient segments from speech so that speech files can be accessed, searched, and browsed more easily and at lower cost. In this paper, a new method for speech summarization without using automatic speech recognitio...

Speech Summarization Methods Using Speaker Tracking and Prominence Based Ranking

Automatic speech summarization is the task of generating a concise summary of a speech signal using a digital computer. Existing speech summarization systems rely on automatic speech recognition (ASR) transcripts and gold standard human summaries to generate summaries of speech signals. These approaches have limitations: ASR errors make summaries less usable by humans, and ASR syst...

The Prosody of Discourse Structure and Content in the Production of Persian EFL Learners

The present research addressed the prosodic realization of global and local text structure and content in the spoken discourse data produced by Persian EFL learners. Two newspaper articles were analyzed using Rhetorical Structure Theory. Based on these analyses, the global structure in terms of hierarchical level, the local structure in terms of the relative importance of text segments and the ...

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making man-machine interaction more natural. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

Improved speech summarization with multiple-hypothesis representations and Kullback-Leibler divergence measures

Imperfect speech recognition often leads to degraded performance when leveraging existing text-based methods for speech summarization. To alleviate this problem, this paper investigates various ways to robustly represent the recognition hypotheses of spoken documents beyond the top scoring ones. Moreover, a new summarization method stemming from the Kullback-Leibler (KL) divergence measure and ...

Publication date: 2010